AITopics | online phase

Country:

North America > United States > California > Santa Clara County > Stanford (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.05)
North America > United States > California > Alameda County > Berkeley (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)
Information Technology > Data Science > Data Mining > Big Data (0.46)

Neural Information Processing Systems | Feb-10-2026, 09:42:58 GMT

8433bb4f7477bf8202614ce1ae8b1169-Supplemental-Conference.pdf

assumption, online phase, rfo live, (13 more...)

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)

Neural Information Processing Systems | Feb-10-2026, 09:42:55 GMT

8433bb4f7477bf8202614ce1ae8b1169-Paper-Conference.pdf

assumption, international conference, rfo live, (12 more...)

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.48)

Kalanther, Addison, Bostwick, Daniel, Maheshwari, Chinmay, Sastry, Shankar

Evader-Agnostic Team-Based Pursuit Strategies in Partially-Observable Environments

arXiv.org Artificial Intelligence | Nov-13-2025

We consider a scenario in which a team of two unmanned aerial vehicles (UAVs) pursues an evader UAV within an urban environment. Each agent has a limited view of its environment, in which buildings can occlude its field of view. Additionally, the pursuer team is agnostic about the evader's initial and final locations and about its behavior. Consequently, the team needs to gather information by searching the environment, then track the evader and eventually intercept it. To solve this multi-player, partially-observable, pursuit-evasion game, we develop a two-phase neuro-symbolic algorithm centered around the principle of bounded rationality. First, we devise an offline approach using deep reinforcement learning to progressively train adversarial policies for the pursuer team against fictitious evaders. This creates $k$ levels of rationality for each agent in preparation for the online phase. Then, we employ an online classification algorithm to determine a "best guess" of our current opponent from the set of iteratively-trained strategic agents and apply the best player response. Using this schema, we improved average performance when facing a random evader in our environment.
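The online step described in the abstract above, picking a "best guess" of the opponent from a fixed set of trained levels, can be illustrated with a small belief update. This is only a sketch under assumptions: the abstract does not say which classifier the authors use, so a plain Bayesian update is swapped in here, and the function names and the likelihood numbers are hypothetical.

```python
# Hypothetical sketch: maintain a belief over k candidate evader levels and
# pick the most probable one after each observation. The Bayesian update is
# an assumption for illustration, not the classifier from the paper.

def update_beliefs(beliefs, likelihoods):
    """Return the posterior, proportional to prior times observation likelihood."""
    unnormalized = [b * l for b, l in zip(beliefs, likelihoods)]
    total = sum(unnormalized)
    if total == 0.0:
        # Every candidate model rules out the observation; keep the prior.
        return list(beliefs)
    return [u / total for u in unnormalized]


def best_guess(beliefs):
    """Index of the candidate level with the highest belief."""
    return max(range(len(beliefs)), key=lambda i: beliefs[i])


# Three candidate levels with a uniform prior (made-up example values).
beliefs = [1 / 3, 1 / 3, 1 / 3]

# Each row holds the likelihood of one observed evader move under each level.
observations = [
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
    [0.2, 0.7, 0.1],
]

for likelihoods in observations:
    beliefs = update_beliefs(beliefs, likelihoods)

level = best_guess(beliefs)
print(level)
```

In a setup like this, the pursuers would then play the response policy trained against the selected level; how that response is chosen in the paper is not stated in the abstract.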

evader, machine learning, reinforcement learning, (15 more...)

2511.05812

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
Asia > China > Heilongjiang Province > Daqing (0.04)

Genre: Research Report (0.40)

Industry:

Education > Educational Setting > Online (0.57)
Leisure & Entertainment > Games (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.72)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.56)

arXiv.org Artificial Intelligence | Nov-5-2025

PrivGNN: High-Performance Secure Inference for Cryptographic Graph Neural Networks

Wang, Fuyi, Chen, Zekai, Fan, Mingyuan, Zhou, Jianying, Pan, Lei, Zhang, Leo Yu

Graph neural networks (GNNs) are powerful tools for analyzing and learning from graph-structured (GS) data, facilitating a wide range of services. Deploying such services in privacy-critical cloud environments necessitates the development of secure inference (SI) protocols that safeguard sensitive GS data. However, existing SI solutions largely focus on convolutional models for image and text data, leaving the challenge of securing GNNs and GS data relatively underexplored. In this work, we design, implement, and evaluate PrivGNN, a lightweight cryptographic scheme for graph-centric inference in the cloud. By hybridizing additive and function secret sharings within secure two-party computation (2PC), PrivGNN is carefully designed based on a series of novel 2PC interactive protocols that achieve $1.5\times \sim 1.7\times$ speedups for linear layers and $2\times \sim 15\times$ for non-linear layers over state-of-the-art (SotA) solutions. A thorough theoretical analysis is provided to prove PrivGNN's correctness, security, and lightweight nature. Extensive experiments across four datasets demonstrate PrivGNN's superior efficiency with $1.3\times \sim 4.7\times$ faster secure predictions while maintaining accuracy comparable to plaintext graph property inference.
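For readers unfamiliar with additive secret sharing, one of the two building blocks the abstract names, a minimal two-party sketch follows. It shows only the generic textbook technique over the ring of integers modulo 2^32; the modulus, function names, and example values are assumptions for illustration, and nothing here reproduces PrivGNN's actual protocols or its use of function secret sharing.

```python
# Generic two-party additive secret sharing over Z_{2^32}.
# Illustrative only: not the PrivGNN protocol.
import secrets

MODULUS = 2 ** 32  # assumed ring size for this sketch


def share(value):
    """Split value into two shares that sum to it modulo MODULUS."""
    share0 = secrets.randbelow(MODULUS)
    share1 = (value - share0) % MODULUS
    return share0, share1


def reconstruct(share0, share1):
    """Recombine the two shares into the original value."""
    return (share0 + share1) % MODULUS


def add_shares(a, b):
    """Each party adds its own shares locally; no communication needed."""
    return (a + b) % MODULUS


def scale_share(a, public_constant):
    """Each party multiplies its share by a public constant locally."""
    return (a * public_constant) % MODULUS


# Secret inputs x = 7 and y = 5; the parties compute 3*x + y on shares.
x0, x1 = share(7)
y0, y1 = share(5)

z0 = add_shares(scale_share(x0, 3), y0)  # computed by party 0
z1 = add_shares(scale_share(x1, 3), y1)  # computed by party 1

result = reconstruct(z0, z1)
print(result)  # 26
```

Additions and multiplications by public constants need no interaction between the parties, which is why linear layers are the cheap part of such schemes; non-linear operations require interactive protocols of the kind the abstract refers to.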

artificial intelligence, machine learning, privgnn, (18 more...)

2511.02185

Country:

Oceania > Australia (0.04)
Asia > Singapore (0.04)
Asia > China > Fujian Province > Fuzhou (0.04)
(3 more...)

Genre: Research Report (0.40)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Owfi, Ali, Bamdad, Amirmohammad, Seyfi, Tolunay, Afghah, Fatemeh

Adapt under Attack and Domain Shift: Unified Adversarial Meta-Learning and Domain Adaptation for Robust Automatic Modulation Classification

arXiv.org Artificial Intelligence | Nov-4-2025

Deep learning has emerged as a leading approach for Automatic Modulation Classification (AMC), demonstrating superior performance over traditional methods. However, the vulnerability of deep models to adversarial attacks and their susceptibility to data distribution shifts hinder practical deployment in real-world, dynamic environments. To address these threats, we propose a novel, unified framework that integrates meta-learning with domain adaptation, making AMC systems resistant to both adversarial attacks and environmental changes. Our framework utilizes a two-phase strategy. First, in an offline phase, we employ a meta-learning approach to train the model on clean and adversarially perturbed samples from a single source domain. This method enables the model to generalize its defense, making it resistant to a combination of previously unseen attacks. Subsequently, in the online phase, we apply domain adaptation to align the model's features with a new target domain, allowing it to adapt without requiring substantial labeled data. As a result, our framework achieves a significant improvement in modulation classification accuracy against these combined threats, offering a critical solution to the deployment and operational challenges of modern AMC systems.

amc model, artificial intelligence, machine learning, (15 more...)

2511.01172

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Government (0.90)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Reboul, Sebastian, Halconruy, Hélène, Douc, Randal

Learning Upper Lower Value Envelopes to Shape Online RL: A Principled Approach

arXiv.org Machine Learning | Oct-23-2025

We investigate the fundamental problem of leveraging offline data to accelerate online reinforcement learning - a direction with strong potential but limited theoretical grounding. Our study centers on how to learn and apply value envelopes within this context. To this end, we introduce a principled two-stage framework: the first stage uses offline data to derive upper and lower bounds on value functions, while the second incorporates these learned bounds into online algorithms. Our method extends prior work by decoupling the upper and lower bounds, enabling more flexible and tighter approximations. In contrast to approaches that rely on fixed shaping functions, our envelopes are data-driven and explicitly modeled as random variables, with a filtration argument ensuring independence across phases. The analysis establishes high-probability regret bounds determined by two interpretable quantities, thereby providing a formal bridge between offline pre-training and online fine-tuning. Empirical results on tabular MDPs demonstrate substantial regret reductions compared with both UCBVI and prior methods.
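One simple way to picture how learned bounds could enter an online algorithm is to clip an optimistic value estimate to the interval between the lower and upper envelope. The sketch below assumes exactly that; the abstract does not state how the bounds are incorporated, so the clipping rule, the function names, and the numbers are all hypothetical.

```python
# Hypothetical sketch: restrict optimistic value estimates to a learned
# [lower, upper] envelope. Clipping is an assumed mechanism for illustration,
# not necessarily the rule used in the paper.

def clip_to_envelope(optimistic_value, lower, upper):
    """Clamp one optimistic estimate to the interval [lower, upper]."""
    if lower > upper:
        raise ValueError("lower bound exceeds upper bound")
    return min(max(optimistic_value, lower), upper)


def shape_values(optimistic_values, lowers, uppers):
    """Apply the envelope state by state."""
    return [
        clip_to_envelope(v, lo, hi)
        for v, lo, hi in zip(optimistic_values, lowers, uppers)
    ]


# Made-up per-state estimates (empirical value plus exploration bonus)
# and made-up envelopes assumed to come from the offline stage.
optimistic = [10.0, 4.0, 0.5]
lowers = [1.0, 2.0, 1.0]
uppers = [6.0, 5.0, 3.0]

shaped = shape_values(optimistic, lowers, uppers)
print(shaped)  # [6.0, 4.0, 1.0]
```

Keeping the two bounds as separate arguments mirrors the decoupling of upper and lower envelopes that the abstract highlights: an estimate can be tightened from above in one state and raised from below in another.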

artificial intelligence, machine learning, reinforcement learning, (11 more...)


2510.19528

Country:

North America > United States (0.04)
Europe > France > Île-de-France > Hauts-de-Seine > Nanterre (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Neural Information Processing Systems | Oct-2-2025, 02:41:58 GMT

1019c8091693ef5c5f55970346633f92-AuthorFeedback.pdf

We thank all the reviewers for their feedback and will incorporate it in a future version. Overall, reviewers agree that our paper is technically sound. "the learner knows the logging policy": This is a fairly standard assumption in prior work [1,5,19,22] and holds for [...] Note that this fitting can be done from unlabeled data only. "the number of unlabeled examples is at most the size of logged data set": We look at this setting mostly for simplicity. [...] Our results are in line with a long line of prior work on active learning theory.

active learning, artificial intelligence, machine learning, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Manor, Shalev, Kohandel, Mohammad

IP-Basis PINNs: Efficient Multi-Query Inverse Parameter Estimation

arXiv.org Artificial Intelligence | Sep-10-2025

Solving inverse problems with Physics-Informed Neural Networks (PINNs) is computationally expensive for multi-query scenarios, as each new set of observed data requires a new, expensive training procedure. We present Inverse-Parameter Basis PINNs (IP-Basis PINNs), a meta-learning framework that extends the foundational work of Desai et al. (2022) to enable rapid and efficient inference for inverse problems. Our method employs an offline-online decomposition: a deep network is first trained offline to produce a rich set of basis functions that span the solution space of a parametric differential equation. For each new inverse problem online, this network is frozen, and solutions and parameters are inferred by training only a lightweight linear output layer against observed data. Key innovations that make our approach effective for inverse problems include: (1) a novel online loss formulation for simultaneous solution reconstruction and parameter identification, (2) a significant reduction in computational overhead via forward-mode automatic differentiation for PDE loss evaluation, and (3) a non-trivial validation and early-stopping mechanism for robust offline training. We demonstrate the efficacy of IP-Basis PINNs on three diverse benchmarks, including an extension to universal PINNs for unknown functional terms, showing consistent performance across constant and functional parameter estimation, a significant speedup per query over standard PINNs, and robust operation with scarce and noisy data.

artificial intelligence, machine learning, pinn, (18 more...)

2509.07245

Country: North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.32)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing Systems | Aug-16-2025, 14:01:47 GMT

On the Statistical Efficiency of Reward-Free Exploration in Non-Linear RL

Our analyses indicate that the explorability or reachability assumptions, previously made for the latter two settings, are not necessary statistically for reward-free exploration.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

Country:

North America > United States > Illinois > Champaign County > Urbana (0.04)
North America > United States > Michigan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.46)